Model for Load Balancing on Processors in Parallel Mining of Frequent Itemsets

Author

  • Ravindra Patel
Abstract

Because many large transactional databases are distributed and have large data schemas, a centralized approach to mining association rules in such databases is not feasible. Some distributed algorithms have been developed [FDM, CD], but none of them considers the problem of data skew in distributed mining of association rules. Skew in the datasets degrades the workload balance among the processors involved in distributed mining of association rules. It is therefore important to devise an efficient approach to distributed mining of association rules that can generate homogeneous partitions of the whole dataset, so that the supports of most large itemsets are distributed evenly across the processors. We propose an efficient partitioning technique based on stratified sampling, which generates homogeneous partitions on which the processors work in parallel and generate their local concepts at approximately the same time.
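
The abstract does not say how the strata are defined or how transactions are assigned to processors, so the following is only a minimal sketch of the general idea. It assumes, purely for illustration, that transactions are stratified by their length and that each stratum is dealt round-robin across the partitions. The names `stratified_partitions`, `stratum_of`, and `local_support` are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def stratified_partitions(transactions, n_parts, stratum_of=len, seed=0):
    """Split transactions into n_parts partitions with a similar stratum mix.

    stratum_of is an assumed stratification key (transaction length by
    default); the paper's actual stratification criterion is not given.
    """
    rng = random.Random(seed)

    # Group transactions into strata by the chosen key.
    strata = defaultdict(list)
    for t in transactions:
        strata[stratum_of(t)].append(t)

    parts = [[] for _ in range(n_parts)]
    next_part = 0
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)  # random order within a stratum
        # Deal the stratum round-robin, continuing from where the previous
        # stratum stopped, so partition sizes differ by at most one.
        for t in members:
            parts[next_part].append(t)
            next_part = (next_part + 1) % n_parts
    return parts


def local_support(partition, itemset):
    """Fraction of transactions in the partition that contain itemset."""
    itemset = frozenset(itemset)
    if not partition:
        return 0.0
    return sum(itemset <= t for t in partition) / len(partition)


# Skewed input: short transactions first, long ones last. A naive
# contiguous split would give each processor a very different workload.
data = (
    [frozenset("a")] * 8
    + [frozenset("ab")] * 8
    + [frozenset("abc")] * 8
)
for i, part in enumerate(stratified_partitions(data, n_parts=4)):
    print(i, len(part), local_support(part, "ab"))
```

With partitions built this way, each processor sees roughly the same mix of strata, so the local support of an itemset should be close to its global support. That is the property the abstract refers to as homogeneous partitions.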

Similar resources

Parallel Implementation of Apriori Algorithm

Association rule mining is used to show relations between items in a set of items. The Apriori algorithm is used for mining frequent itemsets from large databases. Parallelism is used to reduce running time and increase performance, and a multi-core processor is used for parallelization. Mining in a serial manner can consume time and reduce performance. To solve this issue we are propo...

An Improved Technique Of Extracting Frequent Itemsets From Massive Data Using MapReduce

The mining of frequent itemsets is a basic and essential task in many data mining applications. Extracting frequent itemsets together with frequent patterns and rules supports applications such as association rule mining and correlation analysis, as well as product sales and marketing. A number of algorithms are used to extract frequent itemsets, such as FP-growth and Eclat. But unfortunately these algorit...

Exploiting Parallelism in Association Rule Mining Algorithms

Association rule mining, one of the major techniques of data mining, involves finding frequent itemsets with minimum support and generating association rules among them with minimum confidence. Finding all frequent itemsets in a large dataset requires a lot of computation, which can be reduced by exploiting parallelism in the sequential algorithms. In this paper, we provide th...

LPAS: High Efficiency Load Balancing Parallel Data Mining Algorithm

Association rule discovery plays an important role in knowledge discovery and data mining, and efficiency is especially crucial for an algorithm that finds frequent itemsets in a large database. Many methods have been proposed to solve this problem. In addition, parallel computing has become a popular trend, for example on cloud platforms, grid systems, or multicore platforms. In this paper, a high effici...

Static Load Balancing of Parallel Mining of Frequent Itemsets Using Reservoir Sampling

In this paper, we present a novel method for parallelizing an arbitrary depth-first search (DFS for short) algorithm for mining all FIs. The method is based on the so-called reservoir sampling algorithm. The reservoir sampling algorithm, in combination with an arbitrary DFS mining algorithm executed on a database sample, takes a uniformly but not independently distributed sample of all FI...

Publication date: 2012